
feat!: make voice receivers return AudioPacket classes instead of just Audio #11432

Open
petergeneric wants to merge 12 commits into discordjs:main from petergeneric:voice-rtp-headers

Conversation

@petergeneric

Currently, RTP packet headers are stripped and clients receive only Opus frames as Buffers. Without the RTP timestamp, clients must fall back on wall-clock delivery time, which makes jitter and drift hard to correct.

This change introduces an AudioPacket interface which extends Buffer with the following read-only fields:

  • sequence (16-bit unsigned counter that increments by one per RTP packet and wraps around after 65535, so out-of-order packets can be identified)
  • timestamp (32-bit unsigned encoder-side sample clock; RFC 7587, the RTP payload format for Opus, requires a 48 kHz clock regardless of the actual audio sample rate)
  • ssrc (lets consumers detect a change of source and reset their Opus decoder state. Updates to this value can already be received via SSRCMap events, but I think it also belongs on AudioPacket: SSRCMap event delivery may be delayed, and the client needs to know as soon as a packet arrives with a new SSRC so it can reset its decoder state and decode that packet correctly)
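
To illustrate how a consumer might use the sequence and ssrc fields listed above, here is a minimal sketch. The names `seqDelta`, `PacketMeta` and `StreamTracker` are hypothetical and not part of this PR; the only thing taken from the description is the meaning of the fields. It compares sequence numbers modulo 2^16 so that the wrap from 65535 to 0 still counts as "next", and it flags an SSRC change so the caller knows to reset its decoder.

```typescript
// Hypothetical helper names; only the field meanings come from the PR description.
interface PacketMeta {
	sequence: number; // 16-bit RTP sequence number
	timestamp: number; // 32-bit RTP timestamp
	ssrc: number; // 32-bit synchronization source
}

/** Signed distance from `a` to `b` in 16-bit sequence space (positive when `b` is newer). */
function seqDelta(a: number, b: number): number {
	const diff = (b - a) & 0xffff;
	return diff >= 0x8000 ? diff - 0x10000 : diff;
}

type Observation = 'first' | 'ssrc-changed' | 'in-order' | 'gap' | 'late';

class StreamTracker {
	private lastSsrc: number | undefined;

	private lastSequence: number | undefined;

	/** Classifies a packet relative to the newest packet seen so far. */
	public observe(meta: PacketMeta): Observation {
		if (this.lastSsrc === undefined || this.lastSequence === undefined) {
			this.lastSsrc = meta.ssrc;
			this.lastSequence = meta.sequence;
			return 'first';
		}

		if (meta.ssrc !== this.lastSsrc) {
			// New source: the caller would reset its Opus decoder state here.
			this.lastSsrc = meta.ssrc;
			this.lastSequence = meta.sequence;
			return 'ssrc-changed';
		}

		const delta = seqDelta(this.lastSequence, meta.sequence);
		if (delta <= 0) return 'late'; // duplicate or out-of-order packet

		this.lastSequence = meta.sequence;
		return delta === 1 ? 'in-order' : 'gap';
	}
}
```

A consumer could call `observe` from its 'data' handler and decide per result whether to decode, reorder, or drop the packet; that policy is left out of the sketch on purpose.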

This change deliberately does not attempt to provide a generalised parser for all the fields in the RTP Header to keep this PR small and focused just on the essential fields.

…nd SSRC. Refactored so that instead of pure Buffer, we now send AudioPacket (interface extending Buffer) which has readonly fields sequence, timestamp, and ssrc.
@petergeneric petergeneric changed the title Voice audio packets: Pass through RTP header values for Sequence, Timestamp, Sync Source feat: Voice audio packets: Pass through RTP header values for Sequence, Timestamp, Sync Source Feb 27, 2026
@coderabbitai

coderabbitai bot commented Feb 27, 2026

📝 Walkthrough


The changes extend audio packet handling in the voice receiver to include RTP metadata (sequence number, timestamp, and SSRC). A new public AudioPacket interface is introduced to expose this metadata, and the VoiceReceiver now wraps decrypted packets with these fields before streaming. Tests validate metadata extraction and backward compatibility.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| AudioPacket Interface Definition (packages/voice/src/receive/AudioReceiveStream.ts) | Introduces new public interface AudioPacket extending Buffer with readonly properties for RTP sequence (16-bit), timestamp (32-bit), and SSRC (32-bit) identifiers. |
| Packet Metadata Wrapping (packages/voice/src/receive/VoiceReceiver.ts) | Adds internal createAudioPacket helper function to attach RTP metadata as non-enumerable properties to decrypted buffers. Extracts sequence and timestamp from incoming UDP packet headers and wraps packets before streaming. |
| Metadata Extraction Tests (packages/voice/__tests__/VoiceReceiver.test.ts) | Adds two new tests validating RTP metadata extraction from desktop/mobile RTP packets and confirming backward compatibility with existing packet handling. |

Sequence Diagram

sequenceDiagram
    actor UDP as UDP Source
    participant VR as VoiceReceiver
    participant CAP as createAudioPacket
    participant ARS as AudioReceiveStream
    
    UDP->>VR: onUdpMessage (encrypted packet)
    Note over VR: Extract RTP sequence,<br/>timestamp from bytes
    VR->>VR: Decrypt packet payload
    VR->>CAP: createAudioPacket(buffer,<br/>sequence, timestamp, ssrc)
    CAP->>CAP: Attach metadata as<br/>non-enumerable properties
    CAP-->>VR: AudioPacket (Buffer + metadata)
    VR->>ARS: stream.push(AudioPacket)
    Note over ARS: Consumer receives<br/>Buffer with metadata

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The description clearly explains the motivation (RTP headers were previously stripped), the solution (AudioPacket interface with sequence, timestamp, ssrc fields), and the rationale for scope limitation. |
| Title check | ✅ Passed | The title states 'make voice receivers return AudioPacket classes instead of just Audio', but the actual change introduces an AudioPacket interface (not a class) that wraps RTP header metadata into decrypted audio packets. The title is partially related but uses imprecise terminology ('classes' vs interface) and doesn't capture the core purpose: exposing RTP header fields (sequence, timestamp, ssrc). |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/voice/src/receive/AudioReceiveStream.ts`:
- Around line 39-41: Update the documentation for the AudioPacket buffer to
state it contains an Opus-encoded payload (with RTP header metadata) rather than
a "decoded Opus packet"; locate the AudioPacket type/comment in
AudioReceiveStream (or the AudioPacket JSDoc) and change the wording to
explicitly say "a Buffer containing an Opus-encoded payload with RTP header
metadata" so API consumers aren't misled.

In `@packages/voice/src/receive/VoiceReceiver.ts`:
- Around line 177-180: The RTP header reads in VoiceReceiver (variables
sequence, timestamp, ssrc reading from msg) can throw for 9–11 byte buffers;
change the early length guard to require at least 12 bytes (e.g., if (msg.length
< 12) return;) or move these reads inside the existing try that wraps
parsePacket so any RangeError is caught; update the check or relocate the reads
in the VoiceReceiver.ts function that computes sequence/timestamp/ssrc to ensure
no unhandled RangeError occurs.
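
For illustration, the length guard suggested above could look like the following sketch. `readRtpFields` and `RTP_FIXED_HEADER_LENGTH` are hypothetical names, not the PR's actual code; the offsets (sequence at byte 2, timestamp at byte 4, SSRC at byte 8) are those of the fixed RTP header in RFC 3550.

```typescript
import { Buffer } from 'node:buffer';

// The fixed RTP header is 12 bytes long (RFC 3550).
const RTP_FIXED_HEADER_LENGTH = 12;

interface RtpFields {
	sequence: number;
	ssrc: number;
	timestamp: number;
}

/**
 * Hypothetical helper: returns null instead of throwing when the datagram
 * is too short to contain the fixed RTP header.
 */
function readRtpFields(msg: Buffer): RtpFields | null {
	// Without this guard, readUInt32BE(8) throws a RangeError for buffers shorter than 12 bytes.
	if (msg.length < RTP_FIXED_HEADER_LENGTH) return null;

	return {
		sequence: msg.readUInt16BE(2),
		timestamp: msg.readUInt32BE(4),
		ssrc: msg.readUInt32BE(8),
	};
}
```

Returning null keeps the caller's control flow explicit; wrapping the reads in the existing try block, the other option mentioned above, would work equally well.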

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0cb8be4 and f04d08b.

📒 Files selected for processing (3)
  • packages/voice/__tests__/VoiceReceiver.test.ts
  • packages/voice/src/receive/AudioReceiveStream.ts
  • packages/voice/src/receive/VoiceReceiver.ts
📜 Review details
🧰 Additional context used
🧬 Code graph analysis (2)
packages/voice/src/receive/VoiceReceiver.ts (1)
packages/voice/src/receive/AudioReceiveStream.ts (1)
  • AudioPacket (42-59)
packages/voice/__tests__/VoiceReceiver.test.ts (1)
packages/voice/__mocks__/rtp.ts (3)
  • RTP_PACKET_DESKTOP (7-16)
  • RTP_PACKET_CHROME (18-26)
  • RTP_PACKET_ANDROID (28-37)
🔇 Additional comments (2)
packages/voice/src/receive/VoiceReceiver.ts (1)

23-30: createAudioPacket wrapper is clean and backward-compatible.

Non-enumerable readonly metadata on Buffer is a good approach for preserving legacy buffer behavior.
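
A rough sketch of the approach described here, assuming nothing about the PR's real createAudioPacket beyond what this comment states: `tagBuffer`, `RtpMetadata` and `TaggedBuffer` are made-up names. Note that later commits in this thread move away from extending Buffer, so this reflects only the revision under review at this point.

```typescript
import { Buffer } from 'node:buffer';

interface RtpMetadata {
	readonly sequence: number;
	readonly ssrc: number;
	readonly timestamp: number;
}

type TaggedBuffer = Buffer & RtpMetadata;

/**
 * Hypothetical helper: attaches RTP metadata to an existing Buffer as
 * non-enumerable, non-writable properties, so code that only treats the
 * value as a Buffer keeps working.
 */
function tagBuffer(payload: Buffer, meta: RtpMetadata): TaggedBuffer {
	return Object.defineProperties(payload, {
		sequence: { value: meta.sequence, enumerable: false, writable: false },
		timestamp: { value: meta.timestamp, enumerable: false, writable: false },
		ssrc: { value: meta.ssrc, enumerable: false, writable: false },
	}) as TaggedBuffer;
}
```

Because the properties are non-enumerable, they do not show up in `Object.keys` or `for...in` over the buffer, which is what keeps existing Buffer consumers unaffected.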

packages/voice/__tests__/VoiceReceiver.test.ts (1)

71-110: Great coverage additions for metadata passthrough and compatibility.

These tests validate both RTP header field extraction and Buffer backward compatibility across multiple packet variants.

 - Improve docstring use (also moved method to be private static to be more in-line with rest of code, and improved clarity of naming)
 - Fix pre-existing issue (the minimum packet length check was 8 bytes, but the code reads a uint32 at offset 8, so the actual minimum length is 12)
 - Fix AudioPacket description
@petergeneric petergeneric changed the title feat: Voice audio packets: Pass through RTP header values for Sequence, Timestamp, Sync Source feat: pass rtp timestamp data through for audio packets Feb 27, 2026
@petergeneric petergeneric changed the title feat: pass rtp timestamp data through for audio packets feat: pass RTP timestamp data through for audio packets Feb 27, 2026
Comment on lines 153 to 158
Member


Since you're already touching RTP headers I noticed that this is no longer correct. We have the full packet here, so this should check the eXtension bit (4th bit of the first byte) of the header instead of looking for magic bytes of the header extension data. And it even looks like this is stripping the wrong amount of header data, unless I'm missing something here.

Author


Yes that's right, it'll need to consider the case where the CC field is > 0.

I've been working on this. It started as a switch to the X bit, but it has grown into a larger change: parsePacket now parses the RTP header fields and passes the payload start position into decrypt, rather than decrypt recomputing it. As a result, the unit tests call parsePacket rather than calling decrypt directly. I also took the opportunity to rename some buffer variables for clarity. See 7575a25

Are you happy with that being part of this PR, or should it be split out into a separate PR?
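
For reference, the header-length arithmetic discussed in this exchange (X bit plus CC field) can be sketched as below. This follows only the generic RFC 3550 layout; `rtpPayloadOffset` is a hypothetical name, and how the PR's parsePacket treats encrypted extension data is not shown in this thread, so this should not be read as that implementation.

```typescript
import { Buffer } from 'node:buffer';

/**
 * Hypothetical helper: returns the byte offset at which the RTP payload
 * starts, or null if the packet is too short for the header it declares.
 */
function rtpPayloadOffset(packet: Buffer): number | null {
	if (packet.length < 12) return null;

	const firstByte = packet.readUInt8(0);
	const hasExtension = (firstByte & 0x10) !== 0; // X bit: 4th bit from the top
	const csrcCount = firstByte & 0x0f; // CC field: low 4 bits

	// Fixed header plus one 32-bit CSRC identifier per CC.
	let offset = 12 + csrcCount * 4;

	if (hasExtension) {
		// Extension header: 16-bit profile id, then 16-bit length in 32-bit words.
		if (packet.length < offset + 4) return null;
		const extensionWords = packet.readUInt16BE(offset + 2);
		offset += 4 + extensionWords * 4;
	}

	return offset <= packet.length ? offset : null;
}
```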

…r.ts

Co-authored-by: Qjuh <76154676+Qjuh@users.noreply.github.com>
…d fixing style to use hyphens. Also move addPacketHeaders to a bare function below the class per review comment
Minor naming changes as part of this because we are working with two Buffers (raw RTP packet, decrypted RTP/DAVE payload).
Change tests in VoiceReceiver.test.ts that were directly testing `decrypt` to instead test `parsePacket`
…er than an extension of Buffer. This breaks backwards-compatibility for existing AudioReceiveStream users, but is cleaner and allows for future extensibility.
… Documented constructor as not a public interface to discourage end-users from using it and breaking if we extend AudioPacket in the future
@codecov

codecov bot commented Mar 5, 2026

Codecov Report

❌ Patch coverage is 89.28571% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 31.70%. Comparing base (0cb8be4) to head (fef397d).
⚠️ Report is 10 commits behind head on main.

Files with missing lines Patch % Lines
packages/voice/src/receive/VoiceReceiver.ts 85.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #11432      +/-   ##
==========================================
+ Coverage   31.66%   31.70%   +0.03%     
==========================================
  Files         386      386              
  Lines       13966    13977      +11     
  Branches     1098     1099       +1     
==========================================
+ Hits         4422     4431       +9     
- Misses       9410     9412       +2     
  Partials      134      134              
Flag Coverage Δ
brokers 11.71% <ø> (ø)
voice 55.63% <89.28%> (+0.21%) ⬆️


@petergeneric
Author

FYI: code coverage fail is misleading, this is existing lack of coverage (these lines are just reorganisations of the code that was already there to make it easier to understand). It looks like there's no test data for DAVE

@vladfrangu
Member

For testing's sake, can you draft up a quick code sample that can be used to see how it works now? (Bonus point: we can use it to document the breaking change.)

@petergeneric
Author

@vladfrangu Sure, here's a sample snippet that'll join a voice channel and subscribe to all speakers, is this what you were thinking of?

import type { VoiceBasedChannel } from 'discord.js';
import {
	joinVoiceChannel,
	entersState,
	VoiceConnectionStatus,
	EndBehaviorType,
	type AudioPacket,
} from '@discordjs/voice';

async function joinAndRecordChannel(channel: VoiceBasedChannel) {
	// Join the voice channel
	const connection = joinVoiceChannel({
		channelId: channel.id,
		guildId: channel.guild.id,
		adapterCreator: channel.guild.voiceAdapterCreator,
		selfDeaf: false, // We want to receive audio
		selfMute: true, // We don't want to send audio
	});

	// Wait for the connection to be ready
	await entersState(connection, VoiceConnectionStatus.Ready, 30_000);

	const receiver = connection.receiver;

	// Subscribe to all users when they speak
	const subscribedUsers = new Set<string>();
	receiver.speaking.on('start', (userId) => {
		if (subscribedUsers.has(userId)) return;
		subscribedUsers.add(userId);

		// Subscribe to the user's audio stream
		const audioStream = receiver.subscribe(userId, {
			end: { behavior: EndBehaviorType.Manual },
		});

		// New API: 'data' event now emits an AudioPacket rather than a Buffer
		audioStream.on('data', (packet: AudioPacket) => handleOpusPacket(userId, packet));

		audioStream.on('error', (err) => {
			console.error(`Audio stream error for ${userId}:`, err);
		});
	});
}

function handleOpusPacket(userId: string, packet: AudioPacket) {
	console.log(
		`User ${userId}: rtp.ssrc=${packet.ssrc} rtp.seq=${packet.sequence} rtp.ts=${packet.timestamp}@48kHz opus=${packet.payload.length}B`,
	);
}

(a better example might show how to use the new fields in opus decoding to produce a fully synchronised PCM from all speakers, but I've extracted this from my transcription bot which defers opus decode to an offline process to minimise load while recording)
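
As a partial illustration of the point above, here is a sketch of how the timestamp field could be used within a single stream to measure gaps before decoding. All function names are hypothetical. The default of 960 samples per packet (20 ms at 48 kHz) is an assumption rather than something stated in this PR, and the functions assume `previous` and `next` are already in arrival order.

```typescript
// RFC 7587 fixes the RTP clock for Opus at 48 kHz.
const OPUS_RTP_CLOCK_HZ = 48_000;

/** Forward distance between two 32-bit RTP timestamps, tolerating wraparound. */
function timestampDelta(previous: number, next: number): number {
	return (next - previous) >>> 0;
}

/** Converts RTP clock ticks to milliseconds. */
function ticksToMilliseconds(ticks: number): number {
	return (ticks * 1_000) / OPUS_RTP_CLOCK_HZ;
}

/**
 * Estimates how many frames are absent between two consecutive packets.
 * `samplesPerFrame` defaults to 960 (an assumed 20 ms frame size).
 */
function missingFrames(previous: number, next: number, samplesPerFrame = 960): number {
	const framesElapsed = Math.round(timestampDelta(previous, next) / samplesPerFrame);
	return Math.max(0, framesElapsed - 1);
}
```

A decoder loop could use `missingFrames` to decide how much silence or concealment to insert. A gap may also mean the speaker simply stopped talking rather than that packets were lost, so the result is an estimate of absent frames, not of loss.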

Here's an initial draft of the breaking change doc entry (have tried to match general style from the Changes in v14 .mdx):

AudioReceiveStream

AudioReceiveStream now emits AudioPacket instances instead of raw Buffers. Each AudioPacket contains the Opus payload along with relevant RTP header metadata (sequence, timestamp, and ssrc).

The Opus audio data can be found in the payload property:

- // Previously, each chunk was a raw Buffer of Opus data:
- receiver.subscribe(userId).on('data', (chunk) => {
- 	processOpusData(userId, chunk); // chunk was a Buffer
- });

+ // Now, each chunk is an AudioPacket with payload and RTP metadata:
+ receiver.subscribe(userId).on('data', (packet: AudioPacket) => {
+ 	processOpusData(userId, packet.payload); // Opus data is on .payload
+ 	console.log(packet.sequence);    // RTP sequence number
+ 	console.log(packet.timestamp);   // RTP timestamp (48kHz clock)
+ 	console.log(packet.ssrc);        // RTP synchronization source
+ });

@vladfrangu
Member

That is perfect, thanks!

@vladfrangu vladfrangu changed the title feat: pass RTP timestamp data through for audio packets feat!: make voice receivers return AudioPacket classes instead of just Audio Mar 9, 2026

3 participants